
Full text of "ERIC ED318062: Difficulty in Learning To Read Speech Spectrograms: The Role of Visual Segmentation."



DOCUMENT RESUME

ED 318 062                                        CS 507 133

AUTHOR        Gabrys, Gareth
TITLE         Difficulty in Learning To Read Speech Spectrograms:
              The Role of Visual Segmentation.
INSTITUTION   Pittsburgh Univ., Pa. Learning Research and
              Development Center.
SPONS AGENCY  Office of Naval Research, Washington, D.C.
REPORT NO     LRDC/PITT/IMP-1
PUB DATE      Feb 90
CONTRACT      N00014-86-K-0361
NOTE          40p.
PUB TYPE      Reports - Research/Technical (143)

EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   Communication Research; Context Clues; *Learning
              Problems; *Sound Spectrographs; Speech; Visual
              Stimuli
IDENTIFIERS   Salience; Salience Effects; Segmentation Theory;
              Speech Research; University of Pittsburgh PA; Visual
              Cues

ABSTRACT



This study was conducted to demonstrate that a
context-dependent discrimination can produce learning difficulty in a
pseudo-spectrogram reading task and to look at what contribution
segmentation makes to that difficulty. Experiment one involved 10 
subjects recruited from the University of Pittsburgh, who were shown 
pseudo-spectrogram patterns and then asked to respond by making 
selections from a screen menu on a computer. Results indicated that 
context-dependent discrimination can be difficult to learn. 
Experiment two was to try to determine whether the learning 
difficulty observed in experiment one was due to context-dependent 
segmentation, to some other factor such as salience or task demands,
or to some interaction of these factors. Fifteen other subjects were 
given a stack of 32 different patterns and asked to circle the 
important parts. Results indicated that lack of salience may play an 
important part in making this type of skill difficult to learn. 
Results of the two experiments point to several factors which can 
affect the difficulty of learning to read speech spectrograms. 
Learning difficulty may be affected by the interaction of 
segmentation with cue salience and task demands. The main conclusion 
was to confirm the influence of segmentation on learning difficulty 
in speech spectrogram reading. (Two figures are included, and 19 
references are attached.) (MG) 



Reproductions supplied by EDRS are the best that can be made
from the original document.



University of Pittsburgh 

Learning Research and Development Center 



Difficulty in Learning to Read Speech Spectrograms: 
The Role of Visual Segmentation 



Gareth Gabrys
Learning Research and Development Center 
University of Pittsburgh 

Technical Report No. LRDC/PITT/IMP-1
Cognitive Science Program
Office of Naval Research
Contract No. N00014-86-K-0361






The research reported in this document was supported by the Cognitive 
Science Program, Office of Naval Research, under the contract number
noted above. The views stated or implied by this report are not necessarily 
those of the United States Government or the United States Navy, which 
has neither reviewed nor endorsed the reported findings. 

Reproduction in whole or part is permitted for any purpose of the U.S. 
Government. 



Approved for public release; distribution unlimited. 






UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE

Form Approved
OMB No. 0704-0188



1a. REPORT SECURITY CLASSIFICATION:
1b. RESTRICTIVE MARKINGS:
2a. SECURITY CLASSIFICATION AUTHORITY:
2b. DECLASSIFICATION/DOWNGRADING SCHEDULE:
3. DISTRIBUTION/AVAILABILITY OF REPORT: Approved for public release;
   distribution unlimited.
4. PERFORMING ORGANIZATION REPORT NUMBER(S): UPITT/LRDC/ONR/IMP-1
5. MONITORING ORGANIZATION REPORT NUMBER(S):
6a. NAME OF PERFORMING ORGANIZATION: Learning Research & Development
    Center, Univ. of Pittsburgh
6b. OFFICE SYMBOL (If applicable):
6c. ADDRESS (City, State, and ZIP Code): Pittsburgh, Pennsylvania 15260
7a. NAME OF MONITORING ORGANIZATION: Cognitive Science Program,
    Office of Naval Research (Code 1142CS), 800 North Quincy Street
7b. ADDRESS (City, State, and ZIP Code): Arlington, VA 22217-5000
8a. NAME OF FUNDING/SPONSORING ORGANIZATION:
8b. OFFICE SYMBOL (If applicable):
8c. ADDRESS (City, State, and ZIP Code):
9. PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER: N00014-86-K-0361
10. SOURCE OF FUNDING NUMBERS:
    PROGRAM ELEMENT NO.: 61153N
    PROJECT NO.: RR04206
    TASK NO.: RR04206-01
    WORK UNIT ACCESSION NO.:

11. TITLE (Include Security Classification): Difficulty in Learning to
    Read Speech Spectrograms: The Role of Visual Segmentation
12. PERSONAL AUTHOR(S): Gareth Gabrys
13a. TYPE OF REPORT: Technical
13b. TIME COVERED: FROM    TO
14. DATE OF REPORT (Year, Month, Day): 1990, February, 7
15. PAGE COUNT: 30
16. SUPPLEMENTARY NOTATION:
17. COSATI CODES: FIELD 05; GROUP 09; SUB-GROUP
18. SUBJECT TERMS (Continue on reverse if necessary and identify by
    block number): Training difficulty; context-dependent; visual
    segmentation; spectrograms; speech spectrogram reading; learning
    difficulties; task demands; salience.



19. ABSTRACT (Continue on reverse if necessary and identify by block
    number): This work examines possible sources of training difficulty
    encountered by learners of speech spectrogram reading. Such
    difficulty has been attributed to the context-dependent nature of
    the visual segmentation of spectrogram patterns (Liberman et al.,
    1968), and suggestions by researchers of other difficult skills
    (Biederman & Shiffrar, 1983) have also implicated visual
    segmentation. In both cases, the discriminations necessary to
    distinguish important parts can be easily made once identified, but
    are enormously difficult to discover. The experiments presented
    here used a pseudo-spectrogram reading task which varied the
    segmentation rules subjects were required to discover. Experiment 1
    found that considerable learning difficulty could be produced by
    this task, but confounded the source of that difficulty among
    several factors. The second experiment attempted to identify the
    sources of the difficulty. Segmentation was found to contribute
    significantly. The salience of the important cues, and,
    potentially, the demands of the learning task were also found to
    increase the difficulty of discovering important visual
    distinctions. These results are discussed with respect to the skill
    of spectrogram reading and theories of perceptual attention
    learning.
20. DISTRIBUTION/AVAILABILITY OF ABSTRACT: [X] UNCLASSIFIED/UNLIMITED
    [ ] SAME AS RPT. [ ] DTIC USERS
21. ABSTRACT SECURITY CLASSIFICATION: Unclassified
22a. NAME OF RESPONSIBLE INDIVIDUAL:
22b. TELEPHONE (Include Area Code): 202/696-4318
22c. OFFICE SYMBOL: ONR 1142CS

DD Form 1473, JUN 86. Previous editions are obsolete.
S/N 0102-LF-014-6603

SECURITY CLASSIFICATION OF THIS PAGE



Difficulties in Learning 




Abstract 



This work examines possible sources of training difficulty encountered 
by learners of speech spectrogram reading. Such difficulty has been 
attributed to the context-dependent nature of the visual segmentation 
of spectrogram patterns (Liberman et al., 1968), and suggestions by
researchers of other difficult skills (Biederman & Shiffrar, 1983) have
also implicated visual segmentation. In both cases, the discriminations
necessary to distinguish important parts can be easily made once
identified, but are enormously difficult to discover. The experiments
presented here used a pseudo-spectrogram reading task which varied
the segmentation rules subjects were required to discover. Experiment
1 found that considerable learning difficulty could be produced by this
task, but confounded the source of that difficulty among several
factors. The second experiment attempted to identify the sources of the
difficulty. Segmentation was found to contribute significantly. The
salience of the important cues, and, potentially, the demands of the
learning task were also found to increase the difficulty of discovering
important visual distinctions. These results are discussed with respect
to the skill of spectrogram reading and theories of perceptual attention 
learning. 



Difficulty in Learning to Read Speech Spectrograms:
The Role of Visual Segmentation

When acquiring a new perceptual skill, a learner is usually 
faced with the problem of learning to recognize new features and 
discovering which combinations of features form meaningful patterns. 
In X-ray reading, for example, a student must learn which features 
indicate normal tissue and which indicate diseased tissue. Such 
learning is cognitive: the visual system breaks up the visual array into 
parts and recognition responses occur to learned features, but 
cognitive processing and training are required to make decisions about 
which parts are important and how they combine to form higher-level
patterns.

Theories of perceptual learning have characterized this cognitive
processing as an hypothesis-and-test procedure (Levine, 1975;
Trabasso & Bower, 1968) which results in the building of pattern
detectors (Kahneman, 1973; Chase & Simon, 1973). More recently,
attention has focused on the types of preferences or heuristics which
may be required to constrain hypothesis search in complex displays
(Michalski, 1983; Medin, Wattenmaker & Michalski, 1987). One type of
constraint the cognitive system must make is where to draw object
boundaries, i.e., which parts belong together as objects. Characteristics
such as spatial relations, overlap, proximity, and shading differences
may play a role in determining object coherence (Treisman, 1988). For
certain perceptual skills, however, such segmentation decisions can 
create difficulties. For example, in x-ray pictures, brightness 
corresponds to the density of tissue rather than any reflective property 
(Squire, 1988). Hence, visual contours and separations may not 
correspond to organ or tissue boundaries. For example, if two organs 
of equal density abut, no contour will appear between them. A 
radiology student needs to learn a new way of segmenting an x-ray 
picture to identify the locations of organs and other tissue groups. 

The present research is concerned with learning difficulties
which may result when visual segmentation does not correspond to
object segmentation. Its focus is on a skill which until recently was
considered extremely difficult if not impossible to learn: speech
spectrogram reading. Much of the difficulty in spectrogram reading has
been attributed to problems in segmenting the display. The goal of this
research was, first, to show that learning difficulty could be produced
by violating segmentation assumptions, and, second, to look at how
segmentation interacts with other stimulus and task variables.

A speech spectrogram is a graph of the energy in different 
frequency components of speech over a short sampled time. Its two 
axes represent frequency and time, and the darkness of a small region 
represents the amount of sound energy at the frequency and time 
matching its coordinates. When real-time spectrographic displays were
first developed, it was hoped that people, especially the hearing
impaired, could be taught to recognize speech by seeing it. However,
learning to identify speech from this graphical display has proven to be
difficult, requiring both an understanding of acoustic-phonetics and
many hours of practice. Potter, Kopp, and Green (1947), in one of the
earliest efforts toward such training, taught a group of subjects to
identify important acoustic features in spectrograms and then had
them try to communicate with each other using a real-time
spectrographic display. They found that the time to learn the most
common words spoken by a single person increased linearly with
practice, at the rate of about 4 words per hour. That is, prior learning
did not aid the learning of new words. A similar learning rate was
found by Greene, Pisoni, and Carrell (1984), who had naive subjects
learn to identify spectrograms of 50 words made by a single speaker. 
The subjects began with four words and were gradually given 
additional sets of four words over 22 sessions. After about 13 sessions 
the subjects were able to learn the new items with few errors and 
show a fair amount of transfer to a new list of words by the same 
speaker (91.3%) and the original word list spoken by a different 
speaker (76%). These studies have been viewed optimistically as 
demonstrating that people can be trained to recognize visual speech. 
However, the studies are limited by their use of speech from a single 
speaker, or by their focus on learning of individual words which would 
not generalize well to continuous speech. 
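As a rough illustration of the display these studies asked people to read, a spectrogram can be computed by slicing the signal into short overlapping frames and measuring the energy in each frequency component of each frame. The sketch below is a minimal Python version using a direct discrete Fourier transform. The frame length, hop size, Hann window, and the 1000 Hz test tone are illustrative assumptions, not parameters taken from any of the studies cited here.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame lists the energy per frequency bin."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # A Hann window reduces leakage between neighbouring frequency bins.
        windowed = [
            x * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)))
            for n, x in enumerate(frame)
        ]
        energies = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(windowed)
            )
            energies.append(abs(coeff) ** 2)
        frames.append(energies)
    return frames


# Hypothetical test signal: a 1000 Hz tone sampled at 8000 Hz.
# Bin k corresponds to k * sample_rate / frame_len = k * 125 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

Plotting time along one axis, bin index along the other, and shading each cell by its energy gives the kind of display described above; for the pure tone, the dark band sits in the bin nearest 1000 Hz in every frame.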

More impressive has been the effort of Dr. Victor Zue, who has
taught himself to read spectrograms of continuous speech,
independent of speaker, with a high level of accuracy (Cole, Rudnicky,
Zue, & Reddy, 1980). Zue systematically studied spectrogram patterns
for one hour per day over several years. This extensive practice, along
with his expertise in acoustic-phonetics, has led him to discover
features and rules which enable him to identify phoneme segments with
an accuracy of about 85%. The features Zue uses are spectral patterns
unique to individual phonemes, but he augments simple detection of
these features with knowledge of coarticulation effects, which can
distort the features, and a knowledge of phonotactic constraints in the
English language. Zue has also been successful in identifying the rules
he uses to recognize phonemes and in teaching others to use the rules
to read spectrograms with much less practice (40 hrs vs 2000 hrs)
(Cole & Zue, 1980).

But what is the original source of the difficulty which limited
subjects in early studies to small vocabularies, and required 2000+
hours of training plus acoustic-phonetic knowledge on the part of
Victor Zue? In an article entitled "Why are speech spectrograms hard
to read?", Liberman et al. (1968) identify the major reason for this
learning difficulty as the context-dependent nature of the acoustic
signal. How a sound is articulated, and hence how it appears on a
spectrogram, depends on what other sounds are made immediately
before and after it. A vowel following a /d/ will look different from one
following a /g/. Context dependency leads to a special learning
difficulty because of the inherent difference between the way the visual
and auditory systems segment the acoustic pattern. To the visual
system, a vowel followed by a stop consonant appears as a wide dark
band beside a narrow dark band with a blank space in between, i.e.,
two distinct objects. However, the auditory segmentation of those two
sounds is more overlapping and blurred; part of the stop sound is due
to the vowel transition. Liberman et al. (1968) saw this difference
between the auditory and visual systems as so fundamental that they 
asserted "no amount of training will cause an appropriate speech 
decoder to develop for a visual input" (p. 131). Victor Zue has proven 
their appraisal wrong, but he has also shown that their analysis of the 
source of difficulty may be correct: much of his ability is based on his 
knowledge of coarticulation (context-dependent) effects. 






Why should context dependency and its associated segmentation 
problem produce learning difficulties? According to Liberman et al.
(1968), the nature of the speech code is such that while the auditory
system has developed to deal with its temporal properties, the visual
system is not capable of processing it in a spatial layout. Yet Victor
Zue's performance demonstrates that it can be accomplished. The
question then is why, from a perceptual learning point of view,
context-dependent features are difficult to identify. One suggestion
comes from recent work by Biederman and Shiffrar (1987) on chick-
sexing. Biederman and Shiffrar (1987) demonstrated that for the skill
of determining the gender of day-old chicks, training time could be
drastically reduced by identifying non-accidental distinguishing
features. Chick-sexing reportedly takes several years of essentially trial
and error practice to achieve high proficiency. By identifying simple
invariant features, Biederman and Shiffrar were able to reduce these
years of training to a simple rule for finding a distinguishing contour.
Although they didn't show why learning was originally so difficult,
Biederman and Shiffrar hypothesized that the critical distinguishing 
features were obscured by their small size and by being embedded in 
other parts. In such cases, they concluded, it is better to provide 
instruction which points out the features than to hope they will be 
discovered by the learner. 

The same causes of difficulty may apply to the reading of speech 
spectrograms. The context-dependent nature of the speech signal 
causes the visual system to break up the display in inappropriate 
places. Additionally, cognitive processes may be more likely to group
certain parts together into objects and restrict attention to these object
units (Ceraso, 1985; Kahneman, 1973). This may produce search
difficulties if features required to identify one pattern are spread across
different objects. An otherwise noticeable distinguishing feature may be
difficult to discover because it is in another "part." This hypothesis is 
examined in the experiments which follow. 

The question of interest to the present work is whether the 
difficulty of learning spectrogram reading is produced by context- 
dependent relations among visual features. To enable experimental 
manipulation of the relations of interest, pseudo-spectrograms were
used. A computer program generated these pseudo-spectrograms based
on feature descriptions and interaction rules of real speech
spectrograms (Zue, unpublished). A general resemblance to actual
spectrograms was maintained. 

The patterns used in the experiments were composed of two- or 
three-phoneme syllables in a vowel-consonant or consonant-vowel-
consonant order. Examples of the pseudo-spectrograms used in
Experiment 1 are shown in Figure 1. The consonants used were the
stop consonants /b/, /p/, /t/, /k/, /d/, and /g/. The vowels used
were /i/ as in "beet," /u/ as in "boot," /ae/ as in "bat," /e/ as in
"bait," /ɔ/ as in "bought," and /o/ as in "boat." Vowel patterns were
quite similar to each other and appeared as wide striated areas with 
two dark formants (Fl and F2) and one lighter formant (F3). Vowels 
differed from each other by their width and the height of their three 
formants. 

The purpose of the two experiments described below is, first, to
demonstrate that a context-dependent discrimination (i.e., one whose
features cross an object boundary) can produce learning difficulty in a
pseudo-spectrogram reading task; and second, to look at what 
contribution segmentation, as distinguished from other factors such as 
salience, makes to that difficulty. 

Experiment 1

To examine the difficulty of learning a context-dependent 
discrimination, a task was set up to compare the learning of three 
pairs of consonants. These pairs were /b/-/p/, /t/-/k/, and /d/-/g/.
Because the objective was to look at within-pair discriminations,
between-pair discriminations were made simple by giving members of
the same pair similar widths, but members of different pairs very
different widths. Hence, /b/ and /p/ were both very thin, /t/ and /k/
were both wide, and /d/ and /g/ were both of medium width. Within-
pair discriminations were of three types: multiple cue, single cue, and
single context-dependent cue. The consonants /b/ and /p/ differed
from each other in texture, shape, and width, and could be
distinguished on any of these dimensions. The consonants /t/ and
/k/ could be reliably distinguished only by a single cue. They had the
same shape, width, and texture, but a different number of formants.
The consonants /d/ and /g/ could also be distinguished only by a
single cue, but this cue could not be found by looking at the
consonant pattern itself. The shape, width, and texture of /d/ and /g/
were identical and the only way to tell them apart was by their
influence on an adjacent vowel. All of the consonants, except /g/,
made the second and third formants of an adjacent vowel curve
slightly downward at the consonant-vowel boundary. The consonant
/g/ made the second and third formants curve toward each other and
meet at the consonant-vowel boundary (velar pinch).

The prediction for the experiment was that the context-
dependent discrimination would be more difficult to learn than either
the single or multiple cue discriminations. 

Method 

Subjects 

Ten subjects were recruited from the University of Pittsburgh. 
The subjects received credit towards an introductory psychology class 
and $10 for their participation. 

Apparatus

The pseudo-spectrogram patterns were shown to the subjects on 
the high resolution display screen of a XEROX 1108 computer. 
Subjects responded by using a mouse to make selections from a 
screen menu. The computer collected the subjects' responses and 
provided accuracy feedback to them. 

Materials

The pseudo-spectrogram patterns were generated by a computer 
program as screen bitmaps. The patterns were 346 X 346 pixels and 
measured 10 cm X 10 cm on the display screen. The phoneme 
patterns were drawn from descriptions which mapped a random
texture of a particular shade of gray to different regions of the space
the pattern was to occupy. The patterns were drawn as lines of these 
small texture patterns, the length of which was predetermined except 
when a line bordered a blank area. In that case the ending point of 
the line was set to a random number within 10 pixels (3 mm) of its 
predetermined ending point. Texture and line-length randomization 
thus provided a small amount of random variability in reappearances 
of the same phoneme. 
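The line-length randomization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report says the ending point was set "to a random number within 10 pixels" of its predetermined value, but it does not say how the number was drawn, so the uniform, symmetric draw and the clamping to the bitmap edge below are assumptions, and `jittered_end` is a hypothetical helper name. The constants also check the stated pixel-to-millimeter conversion.

```python
import random

PIXELS_PER_SIDE = 346          # bitmap size given in the report
CM_PER_SIDE = 10.0             # displayed size given in the report
MM_PER_PIXEL = CM_PER_SIDE * 10 / PIXELS_PER_SIDE   # about 0.29 mm per pixel


def jittered_end(nominal_end, borders_blank, rng, max_jitter=10):
    """Return the ending x-coordinate of one texture line.

    Lines that do not border a blank area keep their predetermined end.
    Lines that do are moved by up to max_jitter pixels (assumed uniform
    and symmetric), then clamped to the bitmap (also an assumption).
    """
    if not borders_blank:
        return nominal_end
    end = nominal_end + rng.randint(-max_jitter, max_jitter)
    return min(max(end, 0), PIXELS_PER_SIDE - 1)


rng = random.Random(0)
ends = [jittered_end(200, True, rng) for _ in range(1000)]
```

With these figures, 10 pixels correspond to roughly 2.9 mm, which agrees with the "3 mm" quoted in the text.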

The patterns for the phonemes /b/ and /p/ were thin long lines 
of either a more striated (/b/) or more random (/p/) texture. For the
phonemes /t/ and /k/, the patterns were a background of random
texture with either a single dark area (for /k/) or two dark areas (for
/t/). Because the descriptions for the background textures of /t/ and
/k/ were identical, the only reliable way of distinguishing between
them was the presence of the extra dark area in /t/. The phonemes
/d/ and /g/ appeared as long striated patterns before a vowel and as
short striated patterns with two appendages after a vowel, but because
their descriptions were identical, the only reliable way to distinguish
between them was by the convergence or lack of convergence of the
formants in the adjacent vowel. Vowel patterns appeared as a striated
uniform background with two dark lower bars below a lighter bar.
Vowels could be discriminated by the amount of space between their 
formants. When vowel formants were curved by the presence of an 
adjacent /g/, only the center of the pattern could be used to 
determine the real distance between formants. 

Design 

Subjects participated in four one-hour sessions held on 
consecutive days except for one of the subjects who participated in 
only three sessions but learned all of the discriminations. The 
spectrogram patterns the subjects saw were all possible consonant-
vowel-consonant combinations of the consonants /b/, /p/, /t/, /d/,
/g/, /k/, and the vowels /i/, /e/, /ae/, /ɔ/, /o/, /u/. The total
number of different combinations was 216. Half of these "words" (108)
were used in each session so that after four sessions the subjects saw
each word pattern only twice. To control for the frequency of seeing
each phoneme, the words were blocked into groups of six in which
each consonant appeared once in prevocalic and postvocalic form, and
each vowel appeared once. A subject saw 18 such blocks in each
session. Before each session, the order of the words within each block 
and the order of the blocks within the session were randomized. 
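The report does not say how the blocks of six were constructed, so the following Python sketch shows only one way to satisfy the stated constraints. It assumes a cyclic-shift construction: block (a, b) contains the six words whose vowel index and final-consonant index are the initial-consonant index shifted by a and b. This partitions all 216 words into 36 blocks in which every consonant occurs once before and once after the vowel and every vowel occurs once. The split into two halves of 18 blocks, the seed, and the label "aw" for the vowel of "bought" are likewise illustrative assumptions.

```python
import random

CONSONANTS = ["b", "p", "t", "d", "g", "k"]
VOWELS = ["i", "e", "ae", "aw", "o", "u"]   # "aw" is a stand-in label


def make_blocks():
    """Partition all 216 CVC words into 36 balanced blocks of six."""
    n = len(CONSONANTS)
    blocks = []
    for a in range(n):          # shift applied to the vowel index
        for b in range(n):      # shift applied to the final-consonant index
            blocks.append([
                (CONSONANTS[i], VOWELS[(i + a) % n], CONSONANTS[(i + b) % n])
                for i in range(n)
            ])
    return blocks


def make_session(blocks, rng):
    """Randomize word order within blocks and block order within a session."""
    session = [list(block) for block in blocks]
    for block in session:
        rng.shuffle(block)
    rng.shuffle(session)
    return session


rng = random.Random(1990)       # arbitrary seed for reproducibility
blocks = make_blocks()
rng.shuffle(blocks)
session_1 = make_session(blocks[:18], rng)   # 18 blocks = 108 words
session_2 = make_session(blocks[18:], rng)
```

Repeating the two halves in sessions 3 and 4 would show each word exactly twice over four sessions, as described above.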

Procedure 


Subjects were tested individually. A subject was seated in front 
of the computer and shown how to use a mouse to choose a letter 
response from a screen menu. The experimenter then briefly explained 
about spectrograms and told the subject that his or her task was to 
learn which letters were represented by each pattern. It was made
clear, however, that the task was a visual one, and the subjects were
discouraged from using strategies based on the sound properties of the
phonemes, such as stress or pitch.

When the experiment began, a pseudo-spectrogram pattern
appeared in the center of the display screen and remained there until
a response was given. Immediately after the pattern's appearance, the
message "Think about your answer..." appeared above the pattern in a
message box. Because of program differences, three of the subjects
saw this message on the screen for 20 seconds, while for the
remaining subjects the message remained on the screen for only 3
seconds. This difference was not expected to influence the results
because most responses, especially early in the experiment, required
more than 20 seconds. Next, a menu appeared on the screen along
with the message "Click on the first sound in the word." The menu 
contained a list of the consonant responses and an example word in 
which the consonant is used. After a subject selected one of the 
consonants, a vowel menu appeared with the message "Click on the 
second sound in the word." Once the vowel was selected, the 
consonant menu reappeared for the third response. After the subject 
made the final response, the program provided feedback. If all three 
responses were correct, the message "That's correct" was displayed in 
the message box. Otherwise, the message "That's wrong" was displayed 
along with the correct answer. The pseudo-spectrogram pattern 
remained on the screen for five seconds after feedback was given. The
subjects were allowed to take a short break halfway through the
session. 

Shortly after the beginning and toward the end of each session, 
the experimenter turned on a tape recorder and asked the subject to
continue with the next six trials but describe verbally what he was 
looking at in the pattern and how he decided what to respond. 

Results and Discussion 

A subject was considered to have learned a consonant pair if he
or she responded correctly to four consecutive trial blocks (8 problems)
with one allowed error on the third or fourth block. The learning point
was taken as the first of the four blocks. Not all of the subjects were
able to learn all three consonant discriminations within the allotted
time. Of the 10 subjects, 9 learned the /b/-/p/ distinction, 6 learned
the /t/-/k/ distinction, and 2 learned the /d/-/g/ distinction.
McNemar's exact test for correlated proportions indicated that
significantly more subjects learned the /b/-/p/ distinction than the
/d/-/g/ distinction (p<.02), but the test of whether more people
learned the /t/-/k/ distinction than learned the /d/-/g/ distinction
was not significant (p=.10).
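The learning criterion described above can be written as a short scan over a subject's block-by-block record for one consonant pair. The Python sketch below is an interpretation, not the original scoring program: it reads "one allowed error on the third or fourth block" as at most one error in total across the third and fourth blocks, with none allowed on the first two, and the function name and input format are hypothetical.

```python
def learning_point(errors_per_block):
    """Return the 1-based block at which a consonant pair was learned.

    errors_per_block[i] is the number of errors made on the pair in
    block i + 1. The criterion is met at the first run of four
    consecutive blocks with no errors on the first two blocks and at
    most one error in total on the third and fourth (an assumed reading
    of the report's wording). Returns None if the criterion is never met.
    """
    for i in range(len(errors_per_block) - 3):
        first, second, third, fourth = errors_per_block[i:i + 4]
        if first == 0 and second == 0 and third + fourth <= 1:
            return i + 1
    return None


# Hypothetical record: criterion first met by blocks 5 to 8,
# with a single allowed error on block 8.
example = [2, 1, 0, 1, 0, 0, 0, 1, 0]
point = learning_point(example)
```

Here `point` is 5, because the run starting at block 5 is the first that satisfies the criterion.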

A matched pairs sign test was used to test whether the learning 
points for the /b/-/p/ and /t/-/k/ distinctions were earlier than for 
the /d/-/g/ distinction. Unlearned distinctions were considered to 
have a learning point of at least 73 (i.e., one greater than the last 
block). If two distinctions were unlearned, the learning points were 
considered to be tied. Using this procedure, the /b/-/p/ and /t/-/k/ 
distinctions were found to have been learned at an earlier point than 
the /d/-/g/ distinction (p<.01 and p<.02, respectively).
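Both McNemar's exact test and the matched-pairs sign test reduce to a binomial tail probability computed on the untied (discordant) pairs, so a single helper covers the two analyses. The Python sketch below is a generic implementation, not the author's procedure. The report does not give the subject-by-subject cross-classification, so the counts in the usage lines rest on a hypothetical assumption: that both subjects who learned /d/-/g/ also learned /b/-/p/, which would leave seven discordant subjects, all favoring /b/-/p/.

```python
from math import comb


def binomial_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


def sign_test_p(n_plus, n_minus, two_sided=True):
    """Exact sign-test p-value; tied pairs must be dropped beforehand."""
    n = n_plus + n_minus
    if n == 0:
        return 1.0
    if two_sided:
        return min(1.0, 2 * binomial_tail(max(n_plus, n_minus), n))
    return binomial_tail(n_plus, n)


# Assumed (not reported) discordant counts for /b/-/p/ versus /d/-/g/.
p_two_sided = sign_test_p(7, 0)
p_one_sided = sign_test_p(7, 0, two_sided=False)
```

Under that assumption the two-sided value is 2 x 0.5^7, about .016, which would be in line with the reported p<.02; other cross-classifications of the same marginal counts would give different values.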

To obtain a measure of how much earlier the single- and 
multiple-cue distinctions were learned, it was necessary for the 
subjects to have learned to distinguish at least two of the three 
consonant pairs. Four subjects failed to meet this criterion and were 
not included in the measure. Of the six remaining subjects, only two 
learned the /d/-/g/ distinction. For the others, the learning point was
estimated as 73. Because this value underestimates the true learning
point, the measure of when the /d/-/g/ distinction was learned is
conservative. Based on this measure, the mean number of trial blocks
required for subjects to learn each consonant pair discrimination is
provided in Table 1. According to these estimates, the /d/-/g/
distinction appears to require a considerably greater amount of
learning time than either the /b/-/p/ or /t/-/k/ distinctions
(approximately 40 additional blocks). 



                                 Consonant Distinction

                          Multiple Cue   Single Cue   Context Cue
                            /b/-/p/       /t/-/k/       /d/-/g/

Mean                         20.17         29.17         66.17
Standard deviation           17.81         23.26         12.17
Number of estimated points    0             0             4

Table 1: Mean number of trial blocks to reach learning
criterion for each consonant distinction.



These results suggest that a context-dependent discrimination 
can be difficult to learn. Fewer subjects were able to learn the /d/-/g/
discrimination in the allotted time. The test on proportion of learners
for each distinction showed that significantly more people learned the
multiple-cue distinction than the context-dependent one. The
difference between the proportion who learned the single-cue
distinction and the context-dependent one, though not significant, was
large (.60 vs .20). For those subjects who did learn the context-
dependent discrimination (or who were optimistically presumed to be
about to learn it when the experiment ended), learning took longer
than for either the multiple cue or the single cue discrimination.
These findings suggest that having to discover a context-dependent
discrimination could account for some of the difficulty encountered in
acquiring the skill of speech spectrogram reading. 






However, these results must be viewed with caution. The 
experiment examined learning of a realistic and complex pattern, and
likely confounded several factors with the context-dependent vs non-
context-dependent comparison. These factors must be ruled out before
learning difficulty can be unambiguously assigned to the context-
dependent manner in which the stimulus is segmented. One such
factor is cue salience. It may simply have been harder for the subjects
to notice the formant curving cue than the other cues. This
explanation is unlikely given that 8 of the 10 subjects mentioned in
their verbal reports that there was something unusual about the
appearance of the formants (i.e., that they were curved or straight).
Nevertheless, salience differences must be ruled out. Another
confounding factor is whether task demands, rather than segmentation
difficulty, made the /d/-/g/ distinction difficult to learn. Subjects may
have noticed the formant curving cue, but because they also were
required to learn the identity of the vowel, may have tried to use
formant curving to distinguish among the different vowels. This may
have "used up" the cue, making it unavailable for use in distinguishing
the consonants. There is support for this possibility in the verbal
reports made by several subjects who mentioned the formant curving 
in conjunction with vowel discriminations. These two possible 
alternative explanations are examined in Experiment 2. 

Experiment 2

In Experiment 2, the goal was to try to determine whether the
learning difficulty observed in Experiment 1 was due to
context-dependent segmentation, to some other factor such as salience
or task demands, or to some interaction of these factors. Segmentation,
in this context, refers to how the cognitive system divides a pattern
into objects. Segmentation was manipulated by having two cues occur
within the same object or by splitting them between two objects.
Salience is how noticeable the features are. This was measured by
having a separate group of subjects circle the parts in the spectrogram
patterns used in this experiment. It was also controlled for in the
experimental design by having different groups of subjects learn each
distinguishing cue both as a between-object cue and as a within-object
cue. Finally, task demands refer to whether the subject was to treat
the different phonemes as separate parts in making a response. In this
experiment subjects made only a single response to the whole pattern,
but an attempt to vary task demands was made through instructional
bias.

Method 

Materials 

The pseudo-spectrogram patterns used in Experiment 2 were
similar to those used in Experiment 1, but to control for all of the
independent variables, several changes were made. First, the patterns
consisted of only two phonemes: a vowel-like pattern, followed by a
consonant-like pattern. The vowel patterns were either thin (T) or wide
(W), and had formants which were either straight (S) or curved (C) and
either high (H) or low (L) in frequency (/i/ vs /ae/). Consonant
patterns could be large (L) or small (S) and had either one (O) or two
(T) formants. Formants appeared as dark spots on the large
consonants and as protrusions on the small consonants. Figure 2
shows some examples of these patterns. The pseudo-spectrogram
patterns were generated in the same way as those in Experiment 1;
the 32 vowel-consonant combinations were drawn 8 times for a total of
256 patterns.
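
The stimulus counts above follow directly from crossing the feature
values. The sketch below is only an illustration of that arithmetic: the
variable names are hypothetical, and it enumerates feature labels rather
than drawing any patterns.

```python
from itertools import product

# Feature codes follow the text; the variable names are illustrative only.
VOWEL_WIDTH = ["Thin", "Wide"]
FORMANT_SHAPE = ["Straight", "Curved"]
FORMANT_HEIGHT = ["High", "Low"]
CONSONANT_SIZE = ["Large", "Small"]
FORMANT_COUNT = ["One", "Two"]

# 2 x 2 x 2 vowel patterns and 2 x 2 consonant patterns.
vowels = ["-".join(v) for v in product(VOWEL_WIDTH, FORMANT_SHAPE, FORMANT_HEIGHT)]
consonants = ["-".join(c) for c in product(CONSONANT_SIZE, FORMANT_COUNT)]

# Every vowel paired with every consonant.
combinations = [(v, c) for v in vowels for c in consonants]

# Each combination was drawn 8 times (the index k stands in for a drawing).
DRAWINGS_PER_COMBINATION = 8
patterns = [(v, c, k) for (v, c) in combinations
            for k in range(DRAWINGS_PER_COMBINATION)]

print(len(vowels), len(consonants), len(combinations), len(patterns))
```

The labels produced this way (for example "Thin-Straight-High" paired
with "Large-One") match the naming used for the example patterns.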

To assess the salience of the patterns' visual features, a group
of 15 subjects (not the same as those in the learning task) were given
a stack of the 32 different patterns and asked to circle the "important
parts." The results of this circling task are given in Table 2. Of
relevance to the present experiment is the finding that the subjects
circled the vowel formants an average of 98% of the time, while
circling the consonant formants an average of only 76% of the time.
Furthermore, the subjects tended to circle curved vowel formants as a
single part (67% of the time), and straight vowel formants as separate
parts (83% of the time). The first consonant formant was circled more
often than the second (81% vs 68%), and formants in the large
consonants were circled more often than formants in the small
consonants (90% vs 59%). Hence, some of the difference in salience
between vowel formants and consonant formants may be due to
difficulty seeing the small consonant formants as distinct parts.



Feature                          Proportion

Whole Vowel                         .13
1st Vowel Formant                   .97
2nd Vowel Formant                   .99
3rd Vowel Formant                   .98
All Other Vowel Features            .22
Whole Consonant                     .33
1st Consonant Formant               .83
2nd Consonant Formant               .69
All Other Consonant Features        .29

Table 2: Proportion of time a feature was circled in part-
circling task.
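
The two summary figures quoted in the text (vowel formants circled about
98% of the time, consonant formants about 76%) can be recovered as
simple averages of the Table 2 proportions. The sketch below assumes an
unweighted mean over formants, which is an assumption about how the
averages were formed.

```python
# Proportions taken from Table 2; the dictionary layout is illustrative.
table2 = {
    "1st Vowel Formant": 0.97,
    "2nd Vowel Formant": 0.99,
    "3rd Vowel Formant": 0.98,
    "1st Consonant Formant": 0.83,
    "2nd Consonant Formant": 0.69,
}

def mean_proportion(keyword):
    """Unweighted mean of the rows whose label contains the keyword."""
    values = [p for label, p in table2.items() if keyword in label]
    return round(sum(values) / len(values), 2)

vowel_formant_mean = mean_proportion("Vowel Formant")
consonant_formant_mean = mean_proportion("Consonant Formant")
print(vowel_formant_mean, consonant_formant_mean)
```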



Design 

The goal of the experiment was to assess whether a
within-object cue would be learned more readily than a between-object
cue. To avoid confounding the type of cue (formant curving or number
of formants) with the location of the cue (within or between objects),
each cue type was learned as both a within-object cue and as a
between-object cue. Because this could not be manipulated within
subjects, an incomplete blocks design was used. Each subject provided
two observations from the 2 X 2 (Cue Type X Cue Location) design,
and a block of two subjects with complementary conditions constituted
a single replication of the design. This confounds the Cue Type X Cue
Location interaction with subjects, but by running enough replications,
this effect could be analyzed as a between-block factor.
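
One way to see why two complementary subjects make up a single
replication is to list the cells each subject contributes. This is a
minimal sketch of that bookkeeping only; the function name and the tuple
encoding are hypothetical.

```python
from itertools import product

CUE_TYPES = ("formant curving", "number of formants")
CUE_LOCATIONS = ("within", "between")

# The four cells of the 2 X 2 (Cue Type X Cue Location) design.
cells = list(product(CUE_TYPES, CUE_LOCATIONS))

def subject_cells(curve_location):
    """Two cells observed for one subject: the curving cue sits at
    curve_location, so the number-of-formants cue sits at the other one."""
    other = "between" if curve_location == "within" else "within"
    return [("formant curving", curve_location),
            ("number of formants", other)]

# A block pairs a Curve-Within subject with a Curve-Between subject.
block = subject_cells("within") + subject_cells("between")

# Together the two subjects cover every cell exactly once.
print(sorted(block) == sorted(cells))
```

Because each subject only ever sees one diagonal of the design, the
interaction contrast is carried by differences between subjects, which
is the confounding the text describes.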

One additional factor, Instruction, was also included as a
between-block factor. One half of the blocks received neutral
instructions which asked them to learn to associate the whole pattern
with a response; the other half received biasing instructions which
asked them to learn the half of the pattern containing the
within-object cue. The within- and between-block designs made up four
conditions: Neutral Instructions, Curve-Within (NCW); Neutral
Instructions, Curve-Between (NCB); Biased Instructions, Curve-Within
(BCW); and Biased Instructions, Curve-Between (BCB). The
Curve-Within/Curve-Between distinction refers to the type of rules
subjects were to learn. Table 3 shows these rules for each condition.



Condition               Left Pattern      Right Pattern             Cons.
(Instructions-
Curve location)

Neutral-Within          Curved, Thin                                /d/
(NCW)                   Straight, Thin                              /k/
                        Wide              One formant               /t/
                        Wide              Two formants              /g/

Neutral-Between         Curved            Small                     /d/
(NCB)                   Straight          Small                     /k/
                                          Large and One formant     /t/
                                          Large and Two formants    /g/

Biased-Within           Curved, Thin                                /d/
(BCW)                   Straight, Thin                              /k/
                        Wide              One formant               /t/
                        Wide              Two formants              /g/

Biased-Between          Curved            Small                     /d/
(BCB)                   Straight          Small                     /k/
                                          Large and One formant     /t/
                                          Large and Two formants    /g/

Table 3: Rules for discriminating patterns in Experiment 2






The Curve-Within groups learned the formant curving cue as a 
within-object cue and the number of formants cue as a between-object 
cue; the Curve-Between groups learned the formant curving cue as a 
between-object cue and the number of formants cue as a within-object 
cue. 

Subjects participated in a single two hour session. The
pseudo-spectrogram patterns the subjects saw were all possible
vowel-consonant combinations as described above. To control for the
frequency of seeing each phoneme, the patterns were grouped into
blocks of eight in which each consonant appeared twice and each
vowel appeared once. Before each session, the order of the patterns
within each block, and the order of the blocks within the session were
randomized for each subject.
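
The blocking constraint can be sketched as follows. This is not the
original stimulus program: the rotation scheme, the function name and
the placeholder labels are all assumptions, chosen only because they
satisfy the stated constraint (each consonant twice and each vowel once
per block of eight).

```python
import random

# Placeholder labels for the 8 vowel and 4 consonant patterns (hypothetical).
vowels = [f"V{i}" for i in range(8)]
consonants = [f"C{j}" for j in range(4)]

def make_blocks(rng):
    """Build four blocks of eight vowel-consonant pairs.

    In block `shift`, vowel i is paired with consonant (i + shift) mod 4,
    so every consonant occurs twice and every vowel once per block, and
    the four blocks together cover all 32 combinations.
    """
    blocks = []
    for shift in range(len(consonants)):
        block = [(v, consonants[(i + shift) % len(consonants)])
                 for i, v in enumerate(vowels)]
        rng.shuffle(block)   # randomize order of patterns within the block
        blocks.append(block)
    rng.shuffle(blocks)      # randomize order of blocks within the session
    return blocks

blocks = make_blocks(random.Random(0))  # fixed seed only for repeatability
for block in blocks:
    print(block)
```

A per-subject seed would give each subject a different order, as the
procedure describes.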

Procedure 

Subjects were tested individually. Each subject was seated in
front of the computer and shown how to use a mouse to choose a
letter response from a screen menu. Then the instructions for the
experiment were displayed on the screen. Subjects in the neutral
conditions were told their task was to learn to identify which pattern
was displayed; subjects in the biased conditions were told to identify
the left (or right) pattern. To ensure that the subjects in the biased
condition read the instructions, they were asked to identify which half
(left or right) of the pattern they were to learn. If they were incorrect,
the instructions reappeared on the screen.

The experiment began with a pseudo-spectrogram pattern
appearing in the center of the display screen. The message "Think
about your answer..." appeared in a message box above the pattern for
3 seconds. Then a menu appeared on the screen along with the
message "Click on the first sound in the word." The menu contained a
list of four responses: /t/, /k/, /d/, and /g/. After the subject made a
response, the program provided feedback. If the response was correct,
the message "That's correct" was displayed in the message box.
Otherwise, the message "That's wrong" was displayed along with the
correct answer. Once feedback was given, the pseudo-spectrogram
pattern remained on the screen for 10 seconds before being replaced
by the pattern for the next trial. Every 32 trials, the subject was
allowed to take a short break before continuing.

After the session, the experimenter turned on a tape recorder
and asked the subject to identify 8 patterns and describe what she
looked at in the pattern and how she decided what to respond.

Subjects

Forty-eight introductory psychology students from the University
of Pittsburgh participated for course credit. Two subjects, both from
the Neutral-Curve-Between condition, were replaced: one quit the
session early, the other hadn't slept for 48 hours prior to the
experiment session and showed no learning. The remaining subjects
were randomly assigned to the four conditions with the constraint of
obtaining 9 full or partial learners (as described below) in each
condition.

Results and Discussion 

As in Experiment 1, subjects had considerable difficulty learning
both the within- and between-object distinctions. A subject was
considered to have learned a distinction when correct responses were
made on two consecutive blocks (8 problems) with one allowed error
on the second block (two subjects were also considered to have learned
a distinction on their final block if the final block was correct and they
gave the correct rule for the distinction in their post-session interview).
By this criterion, the 48 subjects fall into three categories: full
learners, non-learners, and partial learners. Full learners were those
who learned both the between- and within-object distinctions;
non-learners learned neither distinction; partial learners were those
who learned only one of the two distinctions. Table 4 summarizes how
the subjects performed. Eighteen subjects were full learners, twelve
were non-learners, and eighteen were partial learners. Of the partial
learners, 13 learned only the within rule and 5 learned only the
between rule. Of the non-learners, one was from the NCB condition,
two from the BCW condition, and nine from the NCW condition.
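
The learning criterion can be expressed as a small scoring function. The
sketch below is one possible reading, not the scoring code used in the
study: it assumes the input is a list of per-block error counts for a
single distinction, that the learning point is the first block of the
qualifying pair, and that the session has 16 blocks (so an unlearned
distinction is scored as block 17, the last trial block plus one). The
interview-based exception for the final block is not modeled.

```python
def learning_block(errors_per_block, n_blocks=16):
    """Return the 1-based block at which a distinction counts as learned.

    Criterion (as read from the text): an error-free block followed by a
    block with at most one error. Assumption: the learning point is the
    first block of that pair. Unlearned distinctions get n_blocks + 1.
    """
    for i in range(len(errors_per_block) - 1):
        if errors_per_block[i] == 0 and errors_per_block[i + 1] <= 1:
            return i + 1
    return n_blocks + 1

# Hypothetical error counts, for illustration only.
print(learning_block([3, 2, 0, 1, 0, 0]))   # criterion met at block 3
print(learning_block([1] * 16))             # never met: scored as 17
```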






Discriminations learned       NCW    NCB    BCW    BCB

Both discriminations            4      9      1      4
One discrimination
  Within rule only              5      0      7      1
  Between rule only             0      0      1      4
Neither discrimination          9      1      2      0

Table 4: Discriminations learned, by condition
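
The Table 4 counts can be cross-checked against the totals stated in
the text (18 full learners, 13 within-rule-only and 5 between-rule-only
partial learners, 12 non-learners, and 9 full or partial learners per
condition). The sketch below assumes the cell counts are assigned to
conditions as NCW, NCB, BCW, BCB in that column order; that assignment
is a reading of the table, not something stated in prose.

```python
# Counts per condition, assuming the column order NCW, NCB, BCW, BCB.
table4 = {
    "NCW": {"both": 4, "within_only": 5, "between_only": 0, "neither": 9},
    "NCB": {"both": 9, "within_only": 0, "between_only": 0, "neither": 1},
    "BCW": {"both": 1, "within_only": 7, "between_only": 1, "neither": 2},
    "BCB": {"both": 4, "within_only": 1, "between_only": 4, "neither": 0},
}

def total(row):
    """Sum one outcome category over the four conditions."""
    return sum(cond[row] for cond in table4.values())

# Full or partial learners in each condition.
learners = {name: c["both"] + c["within_only"] + c["between_only"]
            for name, c in table4.items()}

print(total("both"), total("within_only"), total("between_only"), total("neither"))
print(learners)
```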



A matched pairs sign test was used to test the main effects of
Cue Location and Cue Type for those subjects who were full or partial
learners. For partial learners, the learning point of the unlearned
distinction was considered to be at least 17 (the last trial block plus
one). By this test, the main effect of Cue Location was not significant
(z = 1.39, p < .09), but the main effect of Cue Type was significant
(z = 4.18, p < .001). The subjects learned the formant curving cue before
the number of formants cue significantly more often than they learned
them in the reverse order. To test the interaction of Cue Type X Cue
Location, each subject's performance was categorized according to its
sign. A chi-square test of independence revealed that the interaction
was significant (χ²(2) = 19.35, p < .001). Formant curving was learned first
as a within-object cue just as often as it was learned first as a
between-objects cue, but the number of formants cue was learned first
as a within-object cue more often than as a between-objects cue.
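
For readers unfamiliar with the sign test, the following sketch shows
the usual large-sample (normal approximation) form. It is an assumption
that this form, without continuity correction and with one-tailed
p-values, resembles what was used here; the counts in the example call
are hypothetical and are not the study's data.

```python
import math

def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate |z|."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))

def sign_test_z(n_plus, n_minus):
    """Normal-approximation sign test; ties are assumed already dropped.

    Under the null hypothesis n_plus is Binomial(n, 0.5), so
    z = (n_plus - n/2) / (sqrt(n)/2) = (n_plus - n_minus) / sqrt(n).
    """
    n = n_plus + n_minus
    return (n_plus - n_minus) / math.sqrt(n)

# Hypothetical counts: 30 subjects learn cue A first, 6 learn cue B first.
z = sign_test_z(30, 6)
print(z, one_tailed_p(z))

# One-tailed probabilities for the z values reported in the text.
print(one_tailed_p(1.39), one_tailed_p(4.18))
```

Under these assumptions the reported z of 1.39 corresponds to a
one-tailed probability just above .08, which is in line with the stated
p < .09.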

To obtain a measure of when the distinctions were learned, the
learning point for unlearned distinctions was estimated as the 17th
block. This value underestimates the true learning block and makes
the measure conservative. Most of these estimations were made for the
between-object distinction when it involved the number of formants
cue. This is also consistent with the observation that an unusually
large number of non-learners were found in the conditions which
required learning this distinction (the NCW and BCW conditions).
Making these estimations, the mean learning block for each distinction
and condition was calculated. These values are given in Table 5. The
measures indicate that the number of formants cue was learned at
least five blocks earlier as a within-object cue than as a between-object
cue, but the formant curving cue was learned at about the same point
for both locations.

                                      Cue Location
Cue Type                           Within     Between

Number of formants
  Mean                              10.28       15.56
  Standard deviation                 5.04        2.59
  Number of estimated points            4          12

Formant curving
  Mean                               8.72        7.44
  Standard deviation                 4.23        3.75
  Number of estimated points            1           1

Table 5: Mean number of trial blocks to reach learning criterion for each consonant distinction
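
The block differences discussed in the text can be recomputed from the
Table 5 means. The sketch below assumes that averaging the two cell
means for a cue type is appropriate, which holds if each cell contains
the same number of observations; that is an assumption, not something
the table states.

```python
# Mean learning blocks from Table 5, keyed by (cue type, cue location).
table5 = {
    ("number of formants", "within"): 10.28,
    ("number of formants", "between"): 15.56,
    ("formant curving", "within"): 8.72,
    ("formant curving", "between"): 7.44,
}

def cue_type_mean(cue_type):
    """Unweighted mean over the two locations for one cue type."""
    return (table5[(cue_type, "within")] + table5[(cue_type, "between")]) / 2

# Between minus within for the number of formants cue.
location_gap = round(table5[("number of formants", "between")]
                     - table5[("number of formants", "within")], 2)

# Number of formants minus formant curving, averaged over locations.
cue_type_gap = round(cue_type_mean("number of formants")
                     - cue_type_mean("formant curving"), 2)

print(location_gap, cue_type_gap)
```

Both differences are lower bounds in the sense the text describes,
because unlearned distinctions were scored at block 17.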



These results suggest that lack of salience may play an
important part in making this type of skill difficult to learn. The sign
test demonstrates that the formant curving cue was more often learned
before the number of formants cue, and the estimates of learning
points show that the formant curving cue was learned at least 4
blocks earlier, on average. The cause of this difference is likely to be
cue salience. In the part circling task, more subjects circled the vowel
formants than the consonant formants, suggesting that the vowel
formants are more salient. The effect of salience, however, does not
explain the learning difficulty observed in the first experiment. In
Experiment 1, number of formants as a within-object cue was learned
sooner and more often than the formant curving cue as a
between-object cue. If this were due to salience, then we should have
found that the number of formants cue was learned sooner than the
formant curving cue in Experiment 2.




Nor can segmentation by itself account for the observed learning
difficulty. Cue Location was not significant, and even the interaction of
Cue Type and Cue Location does not produce a simple explanation.
Context-dependent segmentation does appear to produce learning
difficulty, but this effect may be restricted to cues of lower salience.
The chi-square test on the interaction of Cue Type and Cue Location
showed that more subjects learned the formant curving cue before the
number of formants cue when the number of formants cue was a
between-objects cue, but when the number of formants cue was a
within-objects cue, the order of learning was indifferent to cue type.
Thus, difficulty due to cue location was found for the less salient
number of formants cue but not for the more salient formant curving
cue. However, the degree of impairment for less salient cues appears
to be substantial. More non-learners (11 vs 1) and within-rule-only
learners (12 vs 1) were reported in the conditions which required
learning the number of formants cue as a between-objects cue.
Additionally, the conservative estimate of learning points indicates that
this cue was learned at least five blocks later as a between- than as a
within-object cue.

Yet segmentation does not explain the learning difficulty
observed in the first experiment. In Experiment 1, the formant curving
cue as a between-object cue was found to be much harder to learn
than the number of formants cue as a within-object cue. This finding
was not replicated in the second experiment. In fact, the opposite was
found. Neither salience nor segmentation can account for this
difference because neither was changed between the two experiments.
The only major change was the learning task.

Presumably, the reason the formant curving cue was difficult to
learn in Experiment 1 was the vowel response required in that task.
This was not manipulated in the second experiment, so it is impossible
to be certain. It is interesting to note, however, that the difficulty
disappeared when the vowel identification task was eliminated in
Experiment 2. Unfortunately, the manipulation of instructional bias in
this experiment was too weak to clarify this question. Half of the
subjects were instructed to "learn to identify the right [or "left"] hand
part" of the pattern, but in post-experiment interviews several admitted
to ignoring these instructions. Instructional bias did not significantly
interact with either Cue Type or Cue Location (χ²(2) = 3.31, p > .10;
χ²(2) = 3.82, p > .10, respectively). Future research should determine
whether task demands cause the difficulty observed in Experiment 1
by more strongly manipulating task demands within a single
experiment.
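
The chi-square probabilities reported for Experiment 2 all have two
degrees of freedom, for which the upper-tail probability has a simple
closed form, exp(-x/2). The sketch below uses that identity to check the
direction of each reported inequality; the function name is
illustrative.

```python
import math

def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variate with 2 degrees of
    freedom. With df = 2 the distribution is exponential with mean 2,
    so the survival function is exp(-x / 2)."""
    return math.exp(-x / 2.0)

# Statistics reported in the text.
reported = {
    "Cue Type X Cue Location": 19.35,
    "Instruction X Cue Type": 3.31,
    "Instruction X Cue Location": 3.82,
}

for label, statistic in reported.items():
    print(label, statistic, chi2_upper_tail_df2(statistic))
```

The first value falls well below .001 and the other two fall above .10,
which agrees with the inequalities given in the text.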

General Discussion 

The two experiments presented here point to several factors
which can affect the difficulty of learning to read speech spectrograms.
The original hypothesis, that learning difficulty was caused by
context-dependent relations created by the way the visual system
segments spectrogram patterns, has been shown to be too simple.
Learning difficulty for this skill may be affected by the interaction of
segmentation with cue salience and task demands. Segmentation was
shown to have a considerable influence on difficulty, but this influence
may be restricted to less salient cues. Segmentation may also be
influenced by the demands of the learning task. Although the
experiments did not demonstrate this, it is likely that the type of
response required by the learning task influences task difficulty. The
following discussion examines in more detail why segmentation might
interact with these factors.

The interaction of segmentation with cue salience can be
explained by assuming that whatever learning difficulties are produced
by segmentation can be overcome by a highly salient cue. Salience
has long been known to influence hypothesis selection in
discrimination learning tasks (Trabasso & Bower, 1968). Highly salient
cues are likely to be tried first as hypotheses. If the effect of
segmentation is to make certain cues less available for selection as
hypotheses, then it is easy to understand why a high degree of
salience would overcome this effect. This explanation is supported by
the results of the second experiment reported here, in which the mean
learning block was about the same for all distinctions except for the
condition when the less salient number of formants cue was a
between-objects cue. When the formant curving cue was a
between-objects cue, its highly salient nature made it available for
attention anyway.

Although neither experiment directly manipulated task demands,
the difference between the results of the two experiments suggests that
the type of response the subjects were required to give was also
important. In the first experiment, where the subjects were required to
respond to both consonants and vowels, they had difficulty learning
the highly salient formant curving cue as a between-object cue. In the
second experiment, where subjects made only a single response to the
whole pattern, formant curving was no more difficult to learn as a
between-object cue than as a within-object cue. Since subjects in
Experiment 1 reported using formant curving to distinguish the vowel
responses, it seems likely that including the vowel response made it
more difficult to notice the relevance of the formant curving to the
consonant distinction, perhaps in the following way. A subject might
select the cue as a hypothesis for vowel identification. When this
hypothesis was disconfirmed, the hypothesis may have become less
likely to be selected immediately again. If the formant curving cue was
selected as relevant for vowel discrimination because of the way that
spectrograms are segmented visually, it might be less available for part
of a consonant discrimination. In the second experiment, when the
vowel identification task was eliminated, subjects were more able to
learn formant curving as a between-object cue.

Task demands may also have increased learning difficulty by 
reinforcing any existing segmentation biases. If subjects were required 
to make two responses to a pattern, they may have been more likely to 
see the pattern as two distinct parts, and possibly to assign one 
response to one part, and the other response to the remaining part. 
This may have enhanced any existing bias against crossing part 
boundaries. This hypothesis can be tested only by future research. 

The main conclusion of the present research is to confirm the
influence of segmentation on learning difficulty in speech spectrogram
reading. Although segmentation was not found to be the sole
determiner of such difficulty, in combination with other stimulus and
task variables it appeared to have a substantial influence. One way of
thinking about the effect of segmentation is as a within-object search
bias. People may be biased toward searching within an object's part
boundaries (contour) for discriminating features, before considering
features outside those boundaries. This bias, however, can be
overridden by a highly salient feature in another part. The learning task
is also important to the within-object search bias. If a feature can be
used as a within-object cue, then it may be less likely to be considered
as a between-object cue. Such factors may have led the subjects in
Experiment 1 to believe incorrectly that formant curving indicated
vowel identity, and may have impaired their ability to associate it with
consonant identity.

The existence of a within-object search bias is consistent with
several theories of visual attention. According to the view taken by
Kahneman (1973; Kahneman & Henik, 1981) and Ceraso (1985),
attention to a visual scene is allocated by object units. According to
Kahneman's (1973) model of attention and perception, preattentive
visual processes divide a display into units according to stimulus
properties and simple grouping rules (such as Gestalt rules). These
units are given figural emphasis (attention) based on factors such as
figure-ground relations, features which make something stand out,
and intention. Units which receive this attention are then matched
against memory structures to test for recognition. Visual search
involves the intentional switching of figural emphasis from object to
object, or the attraction of figural emphasis based on a feature (either
stimulus or response selected) which distinguishes the target.
According to the results of the experiments presented above, the
features of a target phoneme unit are more likely to be considered
than features of other phonemes, unless those other features are
highly salient. This result may be due to the way attention is allocated
to a whole part unit. If whole phonemes are attended as wholes, then
the features within the attended phoneme will receive figural emphasis
and be further processed as potential hypotheses. However, if a highly
salient feature, one which draws attention to itself, is in a neighboring
phoneme, it may be included in processing and may even be selected
earlier as a hypothesis. According to this attention-by-parts view, the
within-object search bias may be the result of normal attention
allocation policy within the visual system.






A within-object search bias is also consistent with recent
suggestions that preferences and heuristics are required to restrict the
amount of search involved in concept learning (Michalski, 1983;
Medin, Wattenmaker & Michalski, 1987). This view is not inconsistent
with the attention-by-parts hypothesis, but emphasizes the functional
role of such a bias in the learning process. In complex visual
environments, ordered search for important features (even salience
ordered search) is too resource consuming to be viable. Rather,
preferences for certain features or locations are required to restrict the
scope of search. Restricting the search for a discriminating feature to
the area within the object boundaries of a part is a sensible heuristic.
In our normal visual perception, objects are classified or discriminated
by features within their own object boundaries. Only in certain
artificial environments, such as speech spectrograms or x-ray pictures,
are context-dependent relations set up by visual segmentation. In such
environments, what is normally a useful heuristic actually hinders
search rather than aiding it.

In the second experiment, what was observed was not a
facilitating effect for a within-object cue, but an increased difficulty for
locating a between-object cue. Cues with low salience can be fairly
easily located when they are within the same object, but when a
low-salience cue must be found in a nearby object, learning difficulty is
increased, probably by a tendency to retry discarded within-object
hypotheses. This result has important implications for speech
spectrogram reading. First, it explains at least part of the enormous
difficulty in learning the skill of speech spectrogram reading. In
spectrogram reading, the large variability in the appearance of
phonemes means that the salience of most features is likely to be
quite low. Also, it is important to learn spectrogram patterns at the
individual phoneme level. Hence, the narrow focus induced by the task
should be expected to increase the within-object search bias and
impair discovery of context-dependent features.

Some individuals, too, might be more affected by a search bias
than others. For some, it may only slow down search, with the
low-salience context-dependent feature found only after within-object
features have been searched. For others, it may mean the complete
abandonment of search after a within-object search has failed. Such
differences depend on an individual's repertoire of strategies and
learning history. Fortunately for students of spectrogram reading,
Victor Zue has identified many of these features, so they do not have
to be discovered anew.

In most visual environments and for most perceptual skills, a
within-object bias is helpful. It restricts the amount of search required
for learning. However, for other environments and skills, such as
speech spectrogram reading, radiology, and passive sonar reading,
where visual objects and real objects do not directly correspond
(Lesgold et al., 1988; Liberman et al., 1968; Smith, 1982), it becomes a
source of learning difficulty. Overcoming such search biases may be an
important part of learning for these skills.






References 

Biederman, I., & Shiffrar, M. M. (1987). Sexing day-old chicks: A case
    study and expert systems analysis of a difficult perceptual-
    learning task. Journal of Experimental Psychology: Learning,
    Memory, and Cognition, 13, 640-645.

Ceraso, J. (1985). Unit formation in perception and memory. In G. H.
    Bower (Ed.), The psychology of learning and motivation, Vol. 19
    (pp. 179-210). New York: Academic Press.

Chase, W. G., & Simon, H. A. (1973). The mind's eye in chess. In W. G.
    Chase (Ed.), Visual information processing (pp. 215-281). New
    York: Academic Press.

Cole, R. A., Rudnicky, A. I., Zue, V. W., & Reddy, D. R. (1980). Speech
    as patterns on paper. In R. A. Cole (Ed.), Perception and
    production of fluent speech (pp. 4-50). Hillsdale, NJ: Erlbaum.

Cole, R. A., & Zue, V. W. (1980). Speech as eyes see it. In R. S.
    Nickerson (Ed.), Attention and performance VIII (pp. 475-494).
    Hillsdale, NJ: Erlbaum.

Greene, B. G., Pisoni, D. B., & Carrell, T. D. (1984). Recognition of
    speech spectrograms. Journal of the Acoustical Society of
    America, 76, 32-43.

Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ:
    Prentice-Hall.

Kahneman, D., & Henik, A. (1981). Perceptual organization and
    attention. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual
    organization (pp. 181-211). Hillsdale, NJ: Erlbaum.

Lesgold, A., Rubinson, H., Feltovich, P., Glaser, R., Klopfer, D., &
    Wang, Y. (1988). Expertise in a complex skill: Diagnosing X-ray
    pictures. In M. T. H. Chi, R. Glaser, & M. Farr (Eds.), The
    nature of expertise. Hillsdale, NJ: Erlbaum.

Levine, M. (1975). Hypothesis behavior by humans during
    discrimination learning. In M. Levine (Ed.), A cognitive theory of
    learning: Research on hypothesis testing (pp. 181-190). Hillsdale,
    NJ: Erlbaum. (Adapted from Journal of Experimental Psychology,
    1966, 71, 331-338).

Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy,
    M. (1968). Why are speech spectrograms hard to read?
    American Annals of the Deaf, 113, 127-133.

Medin, D. L., Wattenmaker, W. D., & Michalski, R. S. (1987).
    Constraints and preferences in inductive learning: An
    experimental study of human and machine performance.
    Cognitive Science, 11, 299-339.

Michalski, R. S. (1983). A theory and methodology of inductive
    learning. Artificial Intelligence, 20, 111-161.

Potter, R. K., Kopp, G. A., & Green, H. C. (1947). Visible speech. New
    York: D. Van Nostrand Co. Inc.

Smith, M. J. (1982). Human factors research in passive sonar
    operating. Proceedings of the international conference on man
    machine systems. London: Institute of Electrical Engineers.

Squire, L. F. (1988). Fundamentals of radiology. Cambridge, MA:
    Harvard University Press.

Trabasso, T., & Bower, G. H. (1968). Attention in learning. New York:
    Wiley.

Treisman, A. (1986). Properties, parts, and objects. In K. R. Boff, L.
    Kaufman, & J. P. Thomas (Eds.), Handbook of perception and
    human performance. Volume II: Cognitive processes and
    performance. New York: Wiley.

Zue, V. W. (unpublished). Notes on spectrogram reading. Cambridge,
    MA: Massachusetts Institute of Technology, Department of
    Electrical Engineering & Computer Science and Research
    Laboratory of Electronics.



[Figure: example pseudo-spectrogram patterns. Panel labels:
Thin-Straight-High with Large-One; Wide-Straight-Low with Large-Two.]




University of Pi ttsburgh/Lesgold 



1990/01/31 



Dr, Edith Ackermann 
Media Laboratory 
E15-311 

20 Ames Street 
Cambridge, MA 02139 

Or. John R» Anderson 
Department of Psychology 
oarneg i e-^Me 1 1 on Un i vers i ty 
Schenley Park 
Pittsburgh, PA 15213 

Dr. Stephen J. Andrtole, Chairman 
Department of Information Systems 

and Systems Engineering 
George Mason University 
4400 University Drive 
Fairfax, VA 22030 

Prof • John Annett 
University of Warwick 
Department of Psychology 
Coventry CV4 7AL 
ENGLAND 

Dr. Gary Aston-Jones 

Department of Biology 

New York University 

Rm. 1009 Main B(dg. 

100 Washington Square East 

New York, NY 10003 

Dr* Patricia Baggett 
School of Education 
610 E. University, Rm 1302D 
University of Michigan 
Ann Arbor, MI 48109-1259 

Dr. Gautam Biswas 
Department of Computer Science 
Box 1688, Station B 
Vanderbilt University 
Nashvi lie, TN 37235 

Dr. Lyie E. Bourne, Jr. 
Department of Psychology 
Box 345 

University of Colorado 
Boulder, CO 80309 



Dr. Robert Calfee 
School of Education 
Stanford University 
Stanford, CA 94305 

Dr. Joseph Campione
Center for the Study of Reading
University of Illinois

51 Gerty Drive 

Champaign, IL 61820 

Dr. Jaime G. Carbonell
Computer Science Department
Carnegie-Mellon University
Schenley Park 
Pittsburgh, PA 15213 

Dr. John M. Carroll
IBM Watson Research Center 
User Interface Institute 
P.O. Box 704 

Yorktown Heights, NY 10598 

Dr. David E. Clement 
Department of Psychology 
University of South Carolina 
Columbia, SC 29208 

Dr. Jere Confrey
Cornell University
Dept. of Education 
Room 490 Roberts 
Ithaca, NY 14853 

Dr. Lynn A. Cooper
Department of Psychology 
Columbia University 
New York, NY 10027 

Dr. Thomas E. DeZern
Project Engineer, AI
General Dynamics
PO Box 748/Mail Zone [illegible]
Fort Worth, TX 76101

Dr. Ronna Dillon
Department of Guidance and
Educational Psychology
Southern Illinois University
Carbondale, IL






Dr. J. Stuart Donn
Faculty of Education
University of British Columbia
2125 Main Mall
Vancouver, BC CANADA V6T 1Z5

Defense Technical
Information Center
Cameron Station, Bldg 5
Alexandria, VA 22314
(12 Copies)

Dr. John Ellis
Navy Personnel R&D Center
Code 51

San Diego. CA 92252 

ERIC Facility-Acquisitions 
2440 Research Blvd, Suite 550 
Rockville, MD 20850-3238

Dr. K. Anders Ericsson
University of Colorado
Department of Psychology
Campus Box 345
Boulder, CO 80309-0345 

Dr. Marshall J. Farr, Consultant
Cognitive & Instructional Sciences
2520 North Vernon Street
Arlington, VA 22207

CAPT J. Finelli
Commandant (G-PTE)
U.S. Coast Guard 
2100 Second St., S.W. 
Washington, DC 20593 

Dr. Norman Frederiksen
Educational Testing Service
(05-R)
Princeton, NJ 08541



Psychology Department
George Mason University 
4400 University Drive 
Fairfax, VA 22030 

Dr. Robert Glaser
Learning Research
& Development Center
University of Pittsburgh
3939 O'Hara Street
Pittsburgh, PA 15260

Dr. Marvin D. Glock
101 Homestead Terrace
Ithaca, NY 14856 

Dr. Dwight J. Goehring
ARI Field Unit
P.O. Box 5787
Presidio of Monterey, CA 93944-5011

Dr. Sherrie Gott 
AFHRL/MOMJ 

Brooks AFB, TX 78235-5801

Dr. Wayne Gray

Artificial Intelligence Laboratory 
NYNEX 

500 Westchester Avenue 
White Plains, NY 10604 

H. William Greenup
Dep Asst C/S, Instructional
Management (E03A)
Education Center, MCCDC 
Quantico, VA 22134-5050 

Dr. Gerhard Grossing
Atominstitut
Schuttelstrasse 115
Vienna
AUSTRIA A-1020




Dr. Alinda Friedman
Department of Psychology
University of Alberta
Edmonton, Alberta
CANADA T6G 2E9

Dr. Robert M. Gagne
1456 Mitchell Avenue
Tallahassee, FL 32303






Stevan Harnad 

Editor, The Behavioral and 

Brain Sciences 
20 Nassau Street, Suite 240 
Princeton, NJ 08542 






Dr. James Hiebert
Department of Educational
Development
University of Delaware
Newark, DE 19716

Dr. Steven A. Hillyard
Department of Neurosciences, M-008
University of California, San Diego
La Jolla, CA 92093

Ms. Julia S. Hough
Cambridge University Press
40 West 20th Street
New York, NY 10011

Dr. William Howell
Chief Scientist
AFHRL/CA
Brooks AFB, TX 78235-5601

Z. Jacobson
Bureau of Management Consulting
701-365 Laurier Ave., W.
Ottawa, Ontario K1H 5W3
CANADA 

Dr. Daniel B. Jones
U.S. Nuclear Regulatory
Commission
NRR/ILRB 

Washington, DC 20555 

Mr. Paul L. Jones 

Research Division 

Chief of Naval Technical Training 

Building East-1
Naval Air Station Memphis
Millington, TN 38054-5056

Dr. Marcel Just 
Carnegie-Mellon University 
Department of Psychology 
Schenley Park
Pittsburgh, PA 15213 

Dr. Daniel Kahneman 
Department of Psychology 
University of California 
Berkeley, CA 94720 



Dr. Ruth Kanfer
University of Minnesota
Department of Psychology
Elliott Hall
75 E. River Road
Minneapolis, MN 55455 

Dr. Michael Kaplan 
Office of Basic Research 
U.S. Army Research Institute 
5001 Eisenhower Avenue 
Alexandria, VA 22333-5600 

Dr. Milton S. Katz
European Science Coordination
Office
U.S. Army Research Institute
Box 65
FPO New York 09510-1500

Dr. Steven W. Keele 

Department of Psychology 

University of Oregon 

Eugene, OR 97403 

Dr. Frank Keil
Department of Psychology
228 Uris Hall
Cornell University
Ithaca, NY 14850 

Dr. Wendy Kellogg
IBM T. J. Watson Research Ctr.

P.O. Box 704 

Yorktown Heights, NY 10598 

Dr. Thomas Killion
AFHRL/OT
Williams AFB, AZ 85240-6457

Dr. J. Peter Kincaid
Army Research Institute
Orlando Field Unit
c/o PM TRADE-E 
Orlando, FL 32813 

Mr. David A. Kobus 

Naval Health Research Center 

P.O. Box 85122 

San Diego, CA 92138 






Dr. Sylvan Kornblum 

University of Michigan 

Mental Health Research Institute 

205 Washtenaw Place 

Ann Arbor, MI 48109 

Dr. Gary Kress 
628 Spazier Avenue 
Pacific Grove, CA 93950 

Dr. Lois-Ann Kuntz 
3010 S.W. 23rd Terrace 
Apt. No. 105 
Gainesville, FL 32608

Dr. Pat Langley 

NASA Ames Research Ctr. 

Moffett Field, CA 94035 

Dr. Robert W. Lawler

Matthews 118 

Purdue University 

West Lafayette, IN 47907 

Dr. John Levine 
Learning R&D Center 
University of Pittsburgh 
Pittsburgh, PA 15260 

Matt Lewis 

Department of Psychology 
Carnegie-Mellon University 
Pittsburgh, PA 15213 

Dr. Robert Lloyd 

Dept. of Geography 

University of South Carolina

Columbia, SC 29208 

Dr. Don Lyon 
P.O. Box 44

Higley, AZ 85236 

Dr. William L. Maloy

Code 04 

NETPMSA 

Pensacola, FL 32503-5000 

Dr. Elizabeth Martin 
AFHRL/OTE 
Williams AFB
AZ 85240



Dr. James G. May 
Department of Psychology 
University of New Orleans 
Lakefront

New Orleans, LA 70148 

Dr. Joseph C. McLachlan 
Code 52 

Navy Personnel R&D Center 
San Diego, CA 92152-6800 

Dr. D. Michie 

The Turing Institute 

George House 

38 North Hanover Street 

Glasgow G1 2AD

UNITED KINGDOM 

Dr. Allen Munro
Behavioral Technology 
Laboratories - USC 
250 N. Harbor Dr., Suite 309 
Redondo Beach, CA 90277 

Prof. David Navon 
Department of Psychology 
University of Haifa 
Haifa 31999 
ISRAEL 

Library, NPRDC 
Code P201L 

San Diego, CA 92152-6800

Librarian

Naval Center for Applied Research 

in Artificial Intelligence 
Naval Research Laboratory 
Code 5510 

Washington, DC 20375-5000 

Dr. Paul O'Rorke 
Information & Computer Science 
University of California, Irvine 
Irvine, CA 92717 

Dr. Stellan Ohlsson
Learning R&D Center 
University of Pittsburgh 
Pittsburgh, PA 15260 






Dr. James B. Olsen
1875 South State Street

Office of Naval Research,
Arlington, VA 22217-5000
(6 Copies)

Dr. Judith Orasanu
Basic Research Office
Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Jesse Orlansky
Institute for Defense Analyses
1801 N. Beauregard St.
Alexandria, VA 22311

Dr. Okchoon Park
Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Roy Pea
Institute for Research
on Learning
2550 Hanover Street
Palo Alto, CA 94304

Dr. C. Perrino, Chair
Dept. of Psychology
Morgan State University
Cold Spring La.-Hillen Rd.
Baltimore, MD 21239

Dept. of Administrative Sciences
Code 54
Naval Postgraduate School
Monterey, CA 93943-5026

Dr. Martha Polson
Department of Psychology
University of Colorado
Boulder, CO 80309-0345

Dr. Peter Polson
University of Colorado
Department of Psychology
Boulder, CO 80309-0345

Dr. Michael I. Posner
Department of Psychology
University of Oregon
Eugene, OR 97403

Dr. Robert B. Post
Psychology Department
University of California
Davis, CA 95616

Dr. Daniel Reisberg
Reed College
Department of Psychology
Portland, OR 97202

Dr. Ernst Rothkopf
AT&T Bell Laboratories
Room 2D-456
600 Mountain Avenue
Murray Hill, NJ 07974

Dr. Arthur G. Samuel
Yale University
Department of Psychology
Box 11A, Yale Station
New Haven, CT 06520

Dr. Walter Schneider
Learning R&D Center
University of Pittsburgh
3939 O'Hara Street
Pittsburgh, PA 15260

Dr. Hans-Willi Schroiff
Psychological Research
Henkel Corp./WSC
Henkelstr. 67
D-4000 Duesseldorf 1 FRG
WEST GERMANY

Dr. Randall Shumaker
Naval Research Laboratory
Code 5510
4555 Overlook Avenue, S.W.
Washington, DC 20375-5000






Dr. Zita M. Simutis
Chief, Technologies for Skill
Acquisition and Retention
ARI
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Saul Sternberg
University of Pennsylvania
Department of Psychology
3815 Walnut Street
Philadelphia, PA 19104-8196

Dr. Patrick Suppes
Stanford University
Institute for Mathematical
Studies in the Social Sciences
Stanford, CA 94305-4115

Dr. John Tangney
AFOSR/NL, Bldg. 410
Bolling AFB, DC 20332-6448

Dr. Perry W. Thorndyke
FMC Corporation
Central Engineering Labs
1205 Coleman Avenue, Box 580
Santa Clara, CA 95052

Dr. Paul T. Twohig
Army Research Institute
ATTN: PERI-RL
5001 Eisenhower Avenue
Alexandria, VA 22333-5600

Dr. Zita E. Tyer
Department of Psychology
George Mason University
4400 University Drive
Fairfax, VA 22030

Dr. Kurt Van Lehn
Department of Psychology
Carnegie-Mellon University
Schenley Park
Pittsburgh, PA 15213

Dr. Shih-sung Wen
Department of Psychology
Jackson State University
1400 J. R. Lynch Street
Jackson, MS 39217

Mr. Paul T. Wohig
Army Research Institute
5001 Eisenhower Ave.
ATTN: PERI-RL
Alexandria, VA 22333-5600

Dr. Joseph L. Young
National Science Foundation
Room 320
1800 G Street, N.W.
Washington, DC 20550


